Search Results

Documents authored by Cunha, Luís Filipe


Document
Reasoning with Portuguese Word Embeddings

Authors: Luís Filipe Cunha, J. João Almeida, and Alberto Simões

Published in: OASIcs, Volume 104, 11th Symposium on Languages, Applications and Technologies (SLATE 2022)


Abstract
Representing words with semantic distributions to create ML models is a widely used technique to perform Natural Language processing tasks. In this paper, we trained word embedding models with different types of Portuguese corpora, analyzing the influence of the models' parameterization, the corpora size, and domain. Then we validated each model with the classical evaluation methods available: four words analogies and measurement of the similarity of pairs of words. In addition to these methods, we proposed new alternative techniques to validate word embedding models, presenting new resources for this purpose. Finally, we discussed the obtained results and argued about some limitations of the word embedding models' evaluation methods.

Cite as

Luís Filipe Cunha, J. João Almeida, and Alberto Simões. Reasoning with Portuguese Word Embeddings. In 11th Symposium on Languages, Applications and Technologies (SLATE 2022). Open Access Series in Informatics (OASIcs), Volume 104, pp. 17:1-17:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{cunha_et_al:OASIcs.SLATE.2022.17,
  author =	{Cunha, Lu{\'\i}s Filipe and Almeida, J. Jo\~{a}o and Sim\~{o}es, Alberto},
  title =	{{Reasoning with Portuguese Word Embeddings}},
  booktitle =	{11th Symposium on Languages, Applications and Technologies (SLATE 2022)},
  pages =	{17:1--17:14},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-245-7},
  ISSN =	{2190-6807},
  year =	{2022},
  volume =	{104},
  editor =	{Cordeiro, Jo\~{a}o and Pereira, Maria Jo\~{a}o and Rodrigues, Nuno F. and Pais, Sebasti\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2022.17},
  URN =		{urn:nbn:de:0030-drops-167636},
  doi =		{10.4230/OASIcs.SLATE.2022.17},
  annote =	{Keywords: Word Embeddings, Word2Vec, Evaluation Methods}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail